aleatoric uncertainty
- South America > Brazil (0.04)
- Europe > Netherlands > South Holland > Rotterdam (0.04)
- Asia > Japan (0.04)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Health & Medicine > Therapeutic Area > Oncology (0.68)
JUCAL: Jointly Calibrating Aleatoric and Epistemic Uncertainty in Classification Tasks
Heiss, Jakob, Lambrecht, Sören, Weissteiner, Jakob, Wutte, Hanna, Žurič, Žan, Teichmann, Josef, Yu, Bin
We study post-calibration uncertainty for trained ensembles of classifiers. Specifically, we consider both aleatoric (label noise) and epistemic (model) uncertainty. Among the most popular and widely used calibration methods in classification are temperature scaling (i.e., pool-then-calibrate) and conformal methods. However, the main shortcoming of these calibration methods is that they do not balance the proportion of aleatoric and epistemic uncertainty. Not balancing these uncertainties can severely misrepresent predictive uncertainty, leading to overconfident predictions in some input regions while being underconfident in others. To address this shortcoming, we present a simple but powerful calibration algorithm Joint Uncertainty Calibration (JUCAL) that jointly calibrates aleatoric and epistemic uncertainty. JUCAL jointly calibrates two constants to weight and scale epistemic and aleatoric uncertainties by optimizing the negative log-likelihood (NLL) on the validation/calibration dataset. JUCAL can be applied to any trained ensemble of classifiers (e.g., transformers, CNNs, or tree-based methods), with minimal computational overhead, without requiring access to the models' internal parameters. We experimentally evaluate JUCAL on various text classification tasks, for ensembles of varying sizes and with different ensembling strategies. Our experiments show that JUCAL significantly outperforms SOTA calibration methods across all considered classification tasks, reducing NLL and predictive set size by up to 15% and 20%, respectively. Interestingly, even applying JUCAL to an ensemble of size 5 can outperform temperature-scaled ensembles of size up to 50 in terms of NLL and predictive set size, resulting in up to 10 times smaller inference costs. Thus, we propose JUCAL as a new go-to method for calibrating ensembles in classification.
- North America > United States (0.46)
- Europe > Switzerland > Zürich > Zürich (0.04)
- Government (0.45)
- Health & Medicine (0.45)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
- (2 more...)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- South America > Brazil (0.04)
- (2 more...)
Quantifying and Attributing Submodel Uncertainty in Stochastic Simulation Models and Digital Twins
Ghasemloo, Mohammadmahdi, Eckman, David J., Li, Yaxian
Stochastic simulation is widely used to study complex systems composed of various interconnected subprocesses, such as input processes, routing and control logic, optimization routines, and data-driven decision modules. In practice, these subprocesses may be inherently unknown or too computationally intensive to directly embed in the simulation model. Replacing these elements with estimated or learned approximations introduces a form of epistemic uncertainty that we refer to as submodel uncertainty. This paper investigates how submodel uncertainty affects the estimation of system performance metrics. We develop a framework for quantifying submodel uncertainty in stochastic simulation models and extend the framework to digital-twin settings, where simulation experiments are repeatedly conducted with the model initialized from observed system states. Building on approaches from input uncertainty analysis, we leverage bootstrapping and Bayesian model averaging to construct quantile-based confidence or credible intervals for key performance indicators. We propose a tree-based method that decomposes total output variability and attributes uncertainty to individual submodels in the form of importance scores. The proposed framework is model-agnostic and accommodates both parametric and nonparametric submodels under frequentist and Bayesian modeling paradigms. A synthetic numerical experiment and a more realistic digital-twin simulation of a contact center illustrate the importance of understanding how and how much individual submodels contribute to overall uncertainty.
- North America > United States > Texas > Brazos County > College Station (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (2 more...)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > Germany > Saarland > Saarbrücken (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Maryland > Prince George's County > College Park (0.14)
- North America > Canada (0.04)
- Asia > Middle East > Jordan (0.04)
- (2 more...)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.67)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- (2 more...)
- Health & Medicine > Therapeutic Area > Oncology (0.67)
- Government (0.67)
- Information Technology (0.67)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
- Information Technology > Data Science > Data Mining (0.67)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.65)
To Believe or Not to Believe Y our LLM: Iterative Prompting for Estimating Epistemic Uncertainty
We explore uncertainty quantification in large language models (LLMs), with the goal to identify when uncertainty in responses given a query is large. We simultaneously consider both epistemic and aleatoric uncertainties, where the former comes from the lack of knowledge about the ground truth (such as about facts or the language), and the latter comes from irreducible randomness (such as multiple possible answers).
- North America > United States (0.14)
- North America > Canada > Alberta (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- (5 more...)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.67)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)